ABSTRACT

Teaching is an omnibus profession, but each teacher
is a self-sufficient individual and many yardsticks are needed to
measure competence in this role. The evaluation of a teacher should,
in principle, be bound to what students learn and to the attitudes
and values they hold over the long haul. In practice, however, we
tend to separate teaching from learning. Each of the participating
schools in the National Project III uses some form of teacher
evaluation by students, and from these collective experiences it is
possible to be explicit about the issues and problems generated by
such arrangements. These matters range from broad policy questions to
technical decisions pertaining to the evaluating instrument and the
handling of data. The report is a brief summary of these issues, and
most of the examples are taken from Fund Associates schools. Sources
for more detailed information are listed. (Author/MSE)
Number II

Student Evaluation
of Teaching

Teaching is an omnibus profession, but each teacher
is a self-sufficient individual and many yardsticks are
needed to measure competence in this role. The
evaluation of a teacher should, in principle, be bound to
what students learn and to the attitudes and values they
hold over the long haul. In practice, however, we tend to
separate teaching from learning: "I did a good job of
teaching today; but whether my students learned
anything was up to them." This is not a common
response but it does illustrate the natural interest of
teachers toward being evaluated in terms of what they
do as teachers.

Each of the participating schools in National Project
III uses some form of teacher evaluation by students
and from our collective experience we can be quite
explicit about the issues and problems generated by such
arrangements. These matters range from broad policy
questions to technical decisions pertaining to the
evaluating instrument and the handling of data. The
present report is a brief summary of these issues and
most of our examples are taken from the Fund
Associates. Interested readers are encouraged to write
to the Fund Associates in National Project III (see list at
end of text) for procedural specifics. The projects at
Purdue* and Kansas State University are especially
worthy of attention.

The Institutional
Context

The responsibility of the home institution is to
evaluate fairly the individual members of the faculty,
and this is a far more complicated task than the
development and use of student ratings of teachers.
Nevertheless, at many schools these procedures are
not independent events and this section will identify
some of the larger issues that tie student ratings to
institutional policies.

Recognition and promotion on the basis of merit is
often strongly defended, but this principle probably
lacks force in many postsecondary institutions. It is not
particularly difficult to find some good things to say
about most teachers but unless the dimensions of merit
have been explicitly set forth, we may have nothing
more than window dressing for a seniority system of
advancement and recognition. Promotion to a tenured
rank involves a prediction of the career contribution of
the teacher to the aims and goals of the institution. What
are these institutional values to which the aspiring
young teacher must conform?

Institutional Differences

ISSUE 1: To what extent should the [evaluation of]
teachers reflect the priority [values of the]
home institution?

Outstanding teachers share common characteristics
of excellence, regardless of the type of school. Even so,
there are real differences in the pressures experienced
by teachers at different institutions. The statewide
SUNY* system is examining its procedure for granting
special awards for outstanding teaching and will
determine how this mode of recognition is perceived by the
faculties on the different campuses. A close look at a
similar arrangement (special awards to teachers) is
underway at The University of Michigan.* The
SUNY-Oswego* project is a good example of combining the
values and preferences of the individual teacher with
the standards set by the department chairperson.

Course-specific Differences 

ISSUE 2: How might an evaluation procedure balance
institutional needs with the distinctive factors
in the teaching task of the individual teacher?

We tend to talk about teaching as a general skill, but
fair and valid evaluation requires special attention to the
specific conditions of subject matter, teacher, student
characteristics and special conditions affecting the
environment for learning. These influences derive from
different combinations of factors from one teacher to the
next or from one course to another. Most teachers
accept the accountability principle insofar as they have
confidence in the criteria and the measures used for
evaluating their performance. Their sensitivity to these
matters is quite legitimate.

Care must be taken to establish the pertinent criteria
for each instructional setting and to judge the teacher
within this context. The clarity and relevance of the
teacher's course objectives, for example, should carry
considerable weight in teacher evaluation, as should the
ability to organize course content into a productive
hierarchy and to assess student performance in a
manner that supports rather than hinders learning. A
good teacher must be able to provide instructional
materials relevant to the objectives of the course, to
tutor, to counsel, to excite students and, finally, to serve
as an exemplar or model for the attitudes and values
germane to a particular area of research, teaching and
public service. These are some of the dimensions of
good teaching but each is manifest in a distinctive way
by the idiosyncratic teacher/course combination.

*See Criteria I for summaries of the different Fund Associate
programs referred to in the present report. Request copies from:
Center for Research on Learning and Teaching, University of
Michigan, 109 E. Madison, Ann Arbor, MI 48109.



Teaching and/or Research and Service?

ISSUE 3: How much weight is assigned to the evaluation
of teaching in concert with the other
contributions of a faculty member?

There is a difference between the rather specific
responsibilities of a classroom teacher and his or her
broader functions as a member of the faculty.
Institutional recognition often derives from the more
visible activities of committee work, administrative
responsibilities, scientific and scholarly engagement,
publications, community services, leadership in
professional organizations, and the many other
activities that gain attention and favorable reaction from
the larger community. These activities may
or may not contribute to the quality of instruction
received by students in the classroom. What is best for the
institution or the teacher's professional development is, in
the long run, usually best for the students, but in the
meantime, certain aspects of classroom teaching are
important for the here-and-now student.

Good teaching is not necessarily correlated, plus or
minus, with conformity to administrative criteria. Deans,
teachers, and students each view the educational scene
from their own vantage point, and a fair system for
faculty evaluation would openly examine these criteria.
A distinctive feature of the project at the University of
Illinois* is to elicit opinions of students toward their
educational program, their "major." How, for example,
do majors in chemistry regard the undergraduate
program they experience? Faculty judgments are also
obtained in this rather intensive analysis of the quality of
a particular sequence or pattern of courses. Information
from each of the several sources is weighed by a
specially appointed task force responsible for the
review and the development of recommendations.

The Dimensions
for Evaluation

An experienced teacher will usually find that the
average overall rating received from students does not
change dramatically from term to term. Greater
attention, therefore, is given by the teacher to those
questionnaire items which provide specific diagnostic
information about particular features of a course. These data
are probably more useful as a means for helping a
teacher improve a course than as a source of evidence
for purposes of merit recognition and promotion.

Evaluating Content and/or Method

ISSUE 4: By what means might the rating scale
separate course evaluation from the personal
style of the teacher?

A distinction must be made between evaluating the
teacher as a person and the course as an organized
program of study. These are not, of course, independent
factors since a dull teacher can destroy an
otherwise exciting body of knowledge and a
charismatic teacher can breathe life into dreary
textbook knowledge. A rating scale must do more than
scale a teacher's "popularity" since these happiness
scores may be quite unrelated to the educational impact
of a course. Nevertheless, it is not at all uncommon for a
student to like "the course" better than "the teacher"
and these discriminations should not be obscured by
the rating instrument.

Judgments by Peers and Supervisors 

ISSUE 5: Are the factors best evaluated by peers and
supervisors clearly distinguished from
dimensions best judged by students?

The specific task of teaching is only one component
in the full inventory of faculty responsibilities. Attention
to housekeeping chores, for example, might be
important to administrators and to colleagues, but whether or
not the classroom teacher performs these logistical
duties neatly and on time is of little immediate
consequence to students. A teacher's reputation takes shape
among his or her peers from the accumulation of
incidents and comments during the normal course of
departmental and institutional affairs. One's classroom
style may not be known or given much weight by fellow
faculty members who sense there is no single model for
good teaching.

The criteria for successful teaching are not posted on
a bulletin board or encoded in a set of bylaws. These
standards grow and take form as traditions of the
department develop and accommodate to the
necessary variations in teaching style. Academia
treasures individuality, but it takes courage, even so, to
march to a different beat than the one given by the
dean, the department chairperson, or the power
structure within the department. After National Project III has
moved further along, we will prepare a report to analyze
the evaluation of teaching by one's self, by peers, and
by those who administer an educational program.

Student Ratings
of Teachers

The remainder of Criteria II will deal almost
exclusively with student ratings of teachers. Teachers usually
want to know how their students evaluate the main
features of a course and the way it was taught. If these
ratings are obtained within a climate of cooperation and
mutual respect, they are a valuable source of
information about the quality of instruction. We will outline the
considerable research and development activity in the
area of student ratings and will indicate at least some of
the problems and issues. References to specific studies
will be omitted since these can be found in the
publications cited in the Bibliography.

Student Purposes in Evaluation 

ISSUE 6: Do students make accurate observations of
those features of a course that are significant
to them?

The freedom for students to elect different courses
means little if choice is based on trivial information. As
"consumers" they want to know a great deal more about
a course than the content area it covers. Questionnaire
data can indicate what students judge to be important:
instructional objectives, flexibility to pursue particular
topics, how class time is spent, the supporting
resources for instruction, evaluation procedures,
grading standards, frequency and nature of tests, and
finally those idiosyncratic characteristics of the course
and the teacher that might make some difference to
some students.

Mandatory or Voluntary Use of Rating
Instruments

ISSUE 7: What schedule of student ratings (mandatory
or voluntary; every class or some classes)
gives optimum results?

The matter of "overexposure" to a specific rating
system is important. Basically, this comes down to the
question of mandatory versus voluntary use of the
evaluation procedure on the part of the teacher. For an
evaluation system to remain effective, students must be
willing to give carefully considered opinions. We might
expect the average quality of responses to decline
considerably if all classrooms, every term, are saturated
with the requirement to complete the rating forms. As a
compromise, various intermittent schedules could be
established but, in any case, decisions as to the
frequency of questionnaire distribution should serve the
best interests of the teachers.

ISSUE 8: What is the primary purpose of the rating
instrument: evaluation and/or diagnosis?

The specific items used in the rating form must be
consistent with the purpose for which the results are
intended: information for course selection by students or
diagnostic analysis for the teacher. The data that go
forward through administrative channels for merit
review may require yet a further set of questionnaire
items. This selective use of items is an extremely
important matter. When a teacher seeks diagnostic
information, he or she might select questions by which students
could point out the weaker aspects of a course or
teaching method. If these results are then used for merit
review, the evaluative system is working at cross
purposes. If given a choice of questions for an
administrative assessment, a teacher will tend to
emphasize known strong points and perhaps
inadvertently gloss over inadequacies that might damage
his or her teaching image. If the teacher feels student
ratings do not give an accurate reflection of his or her
instructional plan and performance, it would hardly
seem appropriate to forward these findings as evidence
of professional competence. An inflexible or highly
prescribed evaluation system can penalize the inventive
or unconventional teacher. Such systems tend to
converge teaching styles: to reward conformity to a
preestablished template as to what is good teaching.

A major development in the current technology of
student ratings is to include only a few compulsory
items for evaluating the global or general characteristics
of the teacher and the course. The following "core"
items are included in the CAFETERIA system under
current development and use at Purdue University:*

This instructor motivates me to do my best work.

Course assignments are interesting and stimulating.

This instructor explains difficult material clearly.

Overall, this course is among the best I have ever
taken.

Overall, this instructor is among the best teachers I
have known.

The bulk of each CAFETERIA instrument, however,
consists of items selected by the individual teacher from
a "catalog" of 200 or more items. The instructor can
select up to 40 items (some of which can be
self-constructed) referring to particular aspects of the
course for which feedback is desired. This capability
not only adapts the instrument to a variety of courses
and teaching styles but it involves the teacher in a
process he or she can shape or influence. Flexibility
appears to be a major factor in gaining faculty
acceptance and adoption of CAFETERIA services.
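
The arrangement described above, a short compulsory core followed by up to 40 items chosen by the instructor, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_form`, the sample catalog entries, and the rule that drops repeated items are hypothetical and are not taken from the Purdue system. Only the five core items and the 40-item limit come from the text.

```python
# Core items quoted in the text; every form starts with these.
CORE_ITEMS = [
    "This instructor motivates me to do my best work.",
    "Course assignments are interesting and stimulating.",
    "This instructor explains difficult material clearly.",
    "Overall, this course is among the best I have ever taken.",
    "Overall, this instructor is among the best teachers I have known.",
]
MAX_SELECTED = 40  # upper limit on instructor-chosen items stated in the text


def build_form(selected, self_constructed=()):
    """Return the core items followed by the instructor's own choices."""
    chosen = list(selected) + list(self_constructed)
    if len(chosen) > MAX_SELECTED:
        raise ValueError(f"at most {MAX_SELECTED} optional items allowed")
    # Assumed rule: skip optional items that repeat a core item or each other.
    seen = set(CORE_ITEMS)
    optional = []
    for item in chosen:
        if item not in seen:
            seen.add(item)
            optional.append(item)
    return CORE_ITEMS + optional


# Hypothetical catalog entries, invented for illustration only.
form = build_form(
    selected=[
        "Discussion sections were well managed.",
        "Exams reflected the course objectives.",
    ],
    self_constructed=["The field trip added to my understanding."],
)
for number, item in enumerate(form, start=1):
    print(number, item)
```

Keeping the core fixed while leaving the rest to the teacher mirrors the point made above: the compulsory items allow general comparison, and the optional items supply the diagnostic detail the teacher actually wants.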

The IDEA (Instructional Development and
Effectiveness Assessment) system at Kansas State
University* allows the teachers to identify a unique profile of
objectives from a list of 10 different statements.
Students rate their progress toward these aims in
comparison to other classes. They also evaluate the
instructor (20 items), the course demands (4 items), and
complete five "self-rating" items plus eight
demographic-type questions. The instructor receives a detailed report
giving the frequency distribution of responses for all
items and can differentiate the findings in terms of the
best match between the objectives of the course, the
size of the class, and particular instructional
approaches employed. One of the more distinctive
features of the KSU arrangement is its carefully worked
out system of reference points: sets of norms which
allow the teacher to take into account, for example, five
different levels of student motivation and four different
class sizes.

Developing a Useful Yardstick 

ISSUE 9: What is the role of the measurement/evaluation
specialist in the development of local
evaluation instruments?

"Judge not, lest ye be judged." This admonition has
now gone full circle and teachers, who have been
passing judgment on student performance for countless
years, are now being evaluated by students.
Unfortunately, the quality of the measuring instruments is
uneven. Examination and testing bureaus help faculty to
develop discriminating procedures for evaluating
students, but teachers are frequently "graded" by
unreliable homemade instruments. Considering the
complexity and the not-too-subtle threats of student rating
systems, it is mandatory for the institution to develop
procedures that meet at least minimum standards of
consistency, accuracy, and fairness to individual
teachers.

It is easier to measure height and weight than to
assess intelligence or subject-matter knowledge. It is
even more difficult to assess the interaction between
teacher and students. The nature of the task requires
that each rating form be capable of reflecting
functionally relevant characteristics of a given teacher in a
given course at a given institution, with the
foreknowledge that the perception of these conditions
will differ widely among students. The teacher needs no
technical consultant to know how to ask students if they
enjoyed the course. If, however, the questionnaire
becomes rather complex and if certain quantitative
treatments are to be applied to students' responses,
e.g., norms, percentile ranks, etc., the teacher, the
department, and the college are advised to obtain some
guidance from persons knowledgeable about the
several alternatives and pitfalls of such procedures.

The Main Factors to Which Students Respond 

ISSUE 10: Does the choice of items cover the main
factors in instruction?

Over the years literally thousands of different items
have been included in teacher rating forms. By means
of a rather sophisticated statistical treatment, factor
analysis, it is possible to determine which of these
items seem to cluster together, i.e., which ones tend to
measure a common dimension or feature of instruction.
The four principal factors seem to be:

(1) Skill. This is the most powerful general factor
since over half the typical rating form items have
substantial loadings in this factor while less than a tenth of
the items can be placed in any one of the other clusters
or categories. As far as the instructor is concerned, the
skill factor is the most important dimension to be
assessed by students. A sample question would be:
The instructor gives clear explanations.

(2) Rapport, e.g., The instructor treats students with
respect.

(3) Organization, e.g., The instructor uses class time
well.

(4) Overload/Difficulty, e.g., The instructor has made
the course sufficiently difficult to be stimulating.
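
Factor analysis begins from the correlations among items: items whose responses rise and fall together are candidates for a common factor. The sketch below shows only that first step, not factor analysis itself. The responses are invented for illustration, the second item is a hypothetical rewording of a skill question, and the helper name `pearson` is an assumption of this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical 1-5 responses from eight students, invented for illustration.
responses = {
    "gives clear explanations": [5, 4, 4, 2, 3, 5, 1, 2],
    "explains difficult material": [5, 5, 4, 2, 2, 4, 1, 3],
    "treats students with respect": [3, 5, 2, 4, 5, 3, 4, 2],
}

items = list(responses)
for i, first in enumerate(items):
    for second in items[i + 1:]:
        r = pearson(responses[first], responses[second])
        print(f"{first} / {second}: r = {r:.2f}")
```

In this made-up data the two skill items correlate strongly with each other and only weakly with the rapport item, which is the kind of pattern a factor analysis would summarize as separate "skill" and "rapport" clusters. With a real rating form the work belongs with a measurement specialist and a proper statistical package.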

As mentioned at the outset, the true measure of the
teacher is the impact on students. A good rating scale
should, therefore, include items which enable the
students to indicate, in various ways, the impact value of
the course. Given a free choice, the teacher may select
more items aimed at the "teaching" than the "learning"
(impact) side of things.

Norms 

ISSUE 11: Are the available norms applicable and fair
to each of the different teacher-course
combinations?

On the face of it, the interpretation of student ratings
would seem to be more meaningful if students'
responses could be compared to established norms.
There are, nevertheless, some problems in this
arrangement. A normative comparison must be compatible with
the situation-specific characteristics of a given teacher
and a given course. If the use of a student rating scale is
voluntary on the part of the teachers, it is questionable
that "institutional" norms should be developed from a
self-selected sample of the faculty.

Research findings indicate that student ratings show
a "halo effect," that is, more often than not, students
seem to like their teachers and this "bias" shows up
when their ratings are averaged. If these ratings are then
statistically transformed into a "normalized" frequency
distribution, the teacher receives a somewhat distorted
score since half of the teachers who contribute to these
norms will be placed "below average." The
straightforward use of norms at KSU is simply to present the
frequency distribution and the teacher can then make a
direct comparison of his or her ratings with those of
other members of the faculty who teach comparable
courses.
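
The difference between the two ways of reporting can be seen in a small sketch. The ratings below are hypothetical, invented to mimic the favorable skew just described, and the ten-teacher comparison group is an assumption of this example rather than KSU data. The sketch shows a teacher whose rating is favorable in absolute terms but who falls "below average" once ranked against peers, and then prints the plain frequency distribution for direct comparison.

```python
from collections import Counter
from statistics import median

# Hypothetical mean ratings (1 = poor, 5 = excellent) for ten teachers of
# comparable courses, including the teacher of interest.
peer_ratings = [4.6, 4.4, 4.4, 4.2, 4.2, 4.2, 4.0, 4.0, 3.8, 3.4]
my_rating = 4.0

# Ranked view: about half of any group must fall below its own median,
# however favorable the ratings are in absolute terms.
below_median = my_rating < median(peer_ratings)

# Frequency-distribution view: show where every rating falls and let the
# teacher read his or her own position directly.
distribution = Counter(peer_ratings)
for value in sorted(distribution, reverse=True):
    marker = "  <- this teacher" if value == my_rating else ""
    print(f"{value:.1f} {'#' * distribution[value]}{marker}")
print("below the group median:", below_median)
```

A rating of 4.0 on a five-point scale is a favorable judgment by students, yet the ranked view labels it below the group median; the frequency distribution keeps both facts in sight.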

The value of student ratings is increased if the
instructor will focus attention on specific items and on
patterns revealed across item responses rather than
trying to derive a gross "teaching index" score. If the
established norms are limited to a total score, they may
have the effect of pressing individual members of the
faculty to teach in ways that are calculated to yield a
"high grade." This is directly comparable to the
competitive misdirections so frequently seen when students
work for grades rather than to acquire and to
understand a body of knowledge. If a teacher wants to know
how he or she stands overall, simply ask two questions
of students:

1. How do you rate this course, overall, in
comparison with other courses you have taken?

2. How do you rate this teacher in comparison with 
other teachers you have had? 

Follow-up 

ISSUE 12: Can teachers pull themselves up by their
own bootstraps?

Where can a teacher, as a teacher, go for help?
We have never heard of a pedagogical crisis center for
college professors and most of us make quite a point of
hiding the troubles we have with our classes. Good
teaching is taken for granted and most institutions
simply have not found it necessary to establish counseling
mechanisms to assist the troubled teacher, other than
the department chairperson, one's spouse, or Kelsey's
Bar. However, it is perfectly sensible to seek information
as to how best to interpret rating-scale responses. This
does not mean that the teacher is "in trouble."

One of the better examples of a follow-up service is at
Kansas State University where a knowledgeable second
person helps to guard teachers against drawing false
conclusions from rating data, to resolve conflicts and to
choose among alternative types of corrective action.
The KSU follow-up arrangement has become the main
gateway to the larger program of faculty development.

Validity 

ISSUE 13: Does feedback from students bring about
significant changes in the classroom
performance of a teacher?

The research evidence is, again, inconclusive. A
teacher who wants to know the reaction of students to
various features of a course will certainly be sensitive to
the information received; it has salience and immediate
validity. Upon receiving a set of completed ratings from
a course, teachers will frequently tally, examine, fume,
and purr. Even without reference to external norms, it
may be apparent to the teacher that he or she is still not
skilled, for example, in the management of group
discussion. On the other hand, the teacher may be
pleased to find that certain new features of the course are
well received by students. This is valid information.

The one-shot set of questionnaire returns is less
valuable to the teacher or to the administration than the
accumulated ratings over time. This more stable
average is a better indicator of a teacher's characteristic
strengths and weaknesses but whether the teacher can
do much about correcting deficiencies is quite another
question. Knowing one is overweight does not lead
automatically to weight loss; the quiet science professor
is unlikely to become a charismatic spellbinder simply
because this style seems preferred by students. After
five or more years of teaching, it is difficult to change
one's habits of speaking, to become more or less the
authoritarian teacher, to be more receptive to contrary
student opinions, to relax one's standards for grading,
and so on. Nevertheless, most teachers will at least try
to reduce the dissonance between their teaching habits
and how these seem to be perceived by students.

The present statement notes current developments
toward better procedures for evaluating teachers,
especially as judged by students. However, no
paper-and-pencil instrument yet devised can do complete
justice as an evaluating procedure for the college
teacher. Knowing the strengths and the limitations of
these formal arrangements is one important step for
guarding against their misuse.

Activity which is the subject of this report was
supported in whole or in part by the Fund for the
Improvement of Postsecondary Education, Department of
Health, Education, and Welfare. However, the opinions
expressed herein do not necessarily reflect the position
or policy of the Fund and no official endorsement by
that agency should be inferred.
omissions and the views herein presented. SCE



